Reverse-Engineering Instruction Encodings
Abstract
Binary tools such as disassemblers, just-in-time compilers, and executable code rewriters need to have an explicit representation of how machine instructions are encoded. Unfortunately, writing encodings for an entire instruction set by hand is both tedious and error-prone. We describe derive, a tool that extracts bit-level instruction encoding information from assemblers. The user provides derive with assembly-level information about various instructions. Derive automatically reverse-engineers the encodings for those instructions from an assembler by feeding it permutations of instructions and analyzing the resulting machine code. Derive solves the entire MIPS, SPARC, Alpha, and PowerPC instruction sets, and almost all of the ARM and x86 instruction sets. Its output consists of C declarations that can be used by binary tools. To demonstrate the utility of derive, we have built a code emitter generator that takes derive's output and produces C macros for code emission, which we have then used to rewrite a Java JIT backend.
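The probing idea in the abstract (vary the operands, assemble, and compare the resulting machine code) can be illustrated with a minimal sketch. This is not derive's actual algorithm or interface. `toy_assemble` is a hypothetical stand-in for a real assembler that encodes only the MIPS `add rd, rs, rt` instruction, and `infer_fields` and `describe` are illustrative names. The sketch assumes each operand is a register number from 0 to 31 and that each operand maps to one contiguous bit field.

```python
from typing import Callable, Dict, List, Tuple

NUM_REGS = 32  # assumption: 32 general-purpose registers, as on MIPS


def toy_assemble(rd: int, rs: int, rt: int) -> int:
    """Hypothetical stand-in for a real assembler.

    Encodes MIPS `add rd, rs, rt` (R-type: opcode 0, funct 0x20).
    A real setup would invoke an external assembler and read back the bytes.
    """
    return (rs << 21) | (rt << 16) | (rd << 11) | 0x20


def infer_fields(assemble: Callable[..., int],
                 operand_names: List[str]) -> Dict[str, int]:
    """Return, for each operand, the mask of bits that change when it varies."""
    base = {name: 0 for name in operand_names}
    baseline = assemble(**base)
    masks: Dict[str, int] = {}
    for name in operand_names:
        mask = 0
        for value in range(NUM_REGS):
            probe = dict(base, **{name: value})
            # Bits that differ from the baseline belong to this operand's field.
            mask |= assemble(**probe) ^ baseline
        masks[name] = mask
    return masks


def describe(mask: int) -> Tuple[int, int]:
    """Return (lowest, highest) set bit positions of a non-zero mask."""
    low = (mask & -mask).bit_length() - 1
    high = mask.bit_length() - 1
    return low, high


masks = infer_fields(toy_assemble, ["rd", "rs", "rt"])
for name, mask in masks.items():
    low, high = describe(mask)
    print(f"{name}: bits {high}..{low} (mask 0x{mask:08x})")
```

Holding all other operands fixed while sweeping one of them isolates that operand's bits. The bits that never change across any probe are candidates for the fixed opcode portion of the encoding.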
Similar resources
Shingled Graph Disassembly: Finding the Undecidable Path
A probabilistic finite state machine approach to statically disassembling x86 machine language programs is presented and evaluated. Static disassembly is a crucial prerequisite for software reverse engineering, and has many applications in computer security and binary analysis. The general problem is provably undecidable because of the heavy use of unaligned instruction encodings and dynamicall...
Unscheduling, Unpredication, Unspeculation: Reverse Engineering Itanium Executables
EPIC (Explicitly Parallel Instruction Computing) architectures, exemplified by the Intel Itanium, support a number of advanced architectural features, such as explicit instruction-level parallelism, instruction predication, and speculative loads from memory. However, compiler optimizations to take advantage of such architectural features can profoundly restructure the program’s code, making it ...
Rotalumè: A Tool for Automatic Reverse Engineering of Malware Emulators
Malware authors have recently begun using emulation technology to obfuscate their code. They convert native malware binaries into bytecode programs written in a randomly generated instruction set and paired with a native binary emulator that interprets the bytecode. No existing malware analysis can reliably reverse this obfuscation technique. In this paper, we present the first work in automati...
Optimizing and Reverse Engineering Itanium Binaries
EPIC (Explicitly Parallel Instruction Computing) architectures, such as the Intel IA-64 (Itanium), address common bottlenecks in modern architectures by supporting novel features such as explicit instruction-level parallelism, predicated instructions, and control and data speculation. While these features promise to make code more efficient, the fact that these new architectural features are vi...
Publication date: 2001